On Identification of the Threshold Diffusion Processes



Yury A. Kutoyants 

Laboratoire de Statistique et Processus, Université du Maine
72085 Le Mans, Cedex 9, France 

Abstract 

We consider problems of parameter estimation for several models of threshold ergodic diffusion processes in the asymptotics of large samples. These models are the direct continuous time analogues of the threshold autoregressive (TAR) models well known in time series analysis. In such models the trend switches when the observed process attains some (unknown) values, and the problem is to estimate these values or to test hypotheses concerning them. The related statistical problems correspond to singular estimation or testing; for example, the rate of convergence of estimators is $T$ and not $\sqrt{T}$ as in regular estimation problems. We study the asymptotic behavior of the maximum likelihood and Bayesian estimators and discuss the possibility of constructing a goodness of fit test for such models of observation.

1 Introduction



The simplest example of the threshold model is the following threshold autoregressive (TAR) time series:

$$X_{j+1} = \varrho_1 X_j \mathbb{1}_{\{X_j<\vartheta\}} + \varrho_2 X_j \mathbb{1}_{\{X_j\ge\vartheta\}} + \varepsilon_{j+1}, \qquad j = 0,\ldots,n-1, \qquad (1)$$

where $\varepsilon_j$ are i.i.d. $\mathcal{N}(0,s^2)$, $\varrho_1\ne\varrho_2$ and $|\varrho_i|<1$. Therefore we have two different autoregressive processes depending on the region of observations $\{x : x<\vartheta\}$ or $\{x : x\ge\vartheta\}$. This time series has ergodic properties with invariant density close to a weighted sum of two Gaussian densities. If we suppose that $s^2,\varrho_1,\varrho_2$ are known and $\vartheta\in\Theta=(\alpha,\beta)$ is the unknown parameter, then we obtain the first problem of estimation of the threshold $\vartheta$. It is easy to see that the likelihood ratio is a piecewise constant (discontinuous) function of $\vartheta$, so the Fisher information is infinite. As usual in singular estimation problems, the rate of convergence of the maximum likelihood estimator $\hat\vartheta_n$ and of the Bayesian estimator $\tilde\vartheta_n$ is $n$ and not $\sqrt{n}$, i.e., the quantities $n(\hat\vartheta_n-\vartheta)$ and $n(\tilde\vartheta_n-\vartheta)$ have nondegenerate limits.

There are many different threshold regression models of this type, extensively developed in econometrics, and, of course, the identification of these models attracts the attention of statisticians (see, e.g., the works by Quandt (1958), Tong (1990) [10], Chan (1993) [1], Hansen (2000) [B], Fan and Yao (2003), Koul et al. [11], Chan and Kutoyants [2] and the references therein). Note that continuous time models currently find a wide range of applications in econometric problems and occupy a central place in financial mathematics (see, e.g., the work by Shreve [20]).

Our goal is to study several continuous time analogues (diffusion processes) of such threshold type time series and to describe the properties of estimators of the thresholds for these models. Note that the general theory of parameter estimation (in the regular case) for ergodic diffusion processes is by now well developed (see, e.g., [13], [21] and references therein), but the problems of threshold estimation are of singular type and need special consideration. To illustrate these statements of the problem let us consider the following process

$$dX_t = -\rho_1 X_t \mathbb{1}_{\{X_t<\vartheta\}}\,dt - \rho_2 X_t \mathbb{1}_{\{X_t\ge\vartheta\}}\,dt + \sigma\,dW_t, \qquad 0\le t\le T, \qquad (2)$$

where $W_t$ is a Wiener process, $\rho_1\ne\rho_2$ and $\rho_i>0$. We call it the Threshold Ornstein-Uhlenbeck (TOU) process because it can be considered as a mixture of two different Ornstein-Uhlenbeck processes with switching. If we suppose that $\sigma,\rho_1,\rho_2$ are known and $\vartheta\in\Theta=(\alpha,\beta)$ is the unknown parameter, then we obtain the problem of parameter (threshold) estimation.

It is in some sense similar to TAR (1) and the link between them can be clarified by the following consideration. Let us consider the discrete time approximation of the process (2) with $t_j=j\delta$, $j=1,\ldots,n-1$, where $\delta=T/n$; then we obtain

$$X_{t_{j+1}} = (1-\rho_1\delta)\,X_{t_j}\mathbb{1}_{\{X_{t_j}<\vartheta\}} + (1-\rho_2\delta)\,X_{t_j}\mathbb{1}_{\{X_{t_j}\ge\vartheta\}} + \sigma\,[W_{t_{j+1}}-W_{t_j}].$$

This process coincides with (1) if we put $X_j=X_{t_j}$, $\varrho_i=(1-\rho_i\delta)$ and $\varepsilon_{j+1}=\sigma\,[W_{t_{j+1}}-W_{t_j}]\sim\mathcal{N}(0,\sigma^2\delta)$, i.e., $s^2=\sigma^2\delta$. Hence the regression model (1) is a discrete time approximation of the TOU process (2).
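
The discrete time recursion above can be turned directly into a simulation. The following is only a minimal sketch under stated assumptions: the function name `simulate_tou`, its argument names, the default initial value and the seeding are all hypothetical choices made for illustration, and the Euler step is the plain approximation described in the text, not a scheme taken from the paper.

```python
import math
import random


def simulate_tou(rho1, rho2, theta, sigma, T, n, x0=0.0, seed=0):
    """Euler recursion for dX = -rho1*X*1{X<theta}dt - rho2*X*1{X>=theta}dt + sigma dW.

    Returns the list [X_{t_0}, ..., X_{t_n}] with t_j = j*delta, delta = T/n.
    """
    rng = random.Random(seed)          # hypothetical seeding, for reproducibility only
    delta = T / n
    path = [x0]
    for _ in range(n):
        x = path[-1]
        rho = rho1 if x < theta else rho2            # regime chosen by the threshold
        noise = sigma * math.sqrt(delta) * rng.gauss(0.0, 1.0)   # ~ N(0, sigma^2 * delta)
        path.append((1.0 - rho * delta) * x + noise)
    return path


# Example call: a path on [0, 100] with 10**4 steps.
example_path = simulate_tou(rho1=1.0, rho2=3.0, theta=0.5, sigma=1.0, T=100.0, n=10_000)
```

With `sigma = 0` the recursion reduces to the deterministic map $x \mapsto (1-\rho_i\delta)x$, which gives a simple way to check the regime switching.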
The threshold estimation problems for both models are of singular type, and the limit distributions of the MLEs $n(\hat\vartheta_n-\vartheta)$ and $T(\hat\vartheta_T-\vartheta)$ are argsup-type functionals of compound Poisson and Wiener processes respectively.

The process $(X_t)_{t\ge 0}$ has ergodic properties, the invariant density is a mixture of two Gaussian densities, the Fisher information is equal to infinity, and we show that the maximum likelihood and Bayesian estimators converge to two different limit laws.

We consider several other threshold type models of ergodic diffusion pro- 
cesses and study the asymptotic properties of the ML and Bayesian estima- 
tors. We discuss as well the construction of the goodness of fit tests for such 
threshold models. 

2 Threshold Ornstein-Uhlenbeck Process 

2.1 Threshold estimation 

We start with the TOU process 

$$dX_t = -\rho_1 X_t \mathbb{1}_{\{X_t<\vartheta\}}\,dt - \rho_2 X_t \mathbb{1}_{\{X_t\ge\vartheta\}}\,dt + \sigma\,dW_t, \quad X_0, \quad 0\le t\le T, \qquad (3)$$

where we suppose that the following condition is fulfilled.

Condition A. The constants $\rho_1\ne\rho_2$, $\rho_i>0$ and $\sigma^2>0$ are known and the parameter $\vartheta\in\Theta=(\alpha,\beta)$, $\alpha>0$, is unknown. The initial value $X_0$ is a random variable independent of the Wiener process.

The value $\vartheta=0$ is excluded because in the case $\vartheta=0$ there is no jump in the trend coefficient and the properties of estimators are quite different.

We consider the problem of estimation of the threshold $\vartheta$ from the continuous time observations $X^T=(X_t,\,0\le t\le T)$ and we are interested in the asymptotic behavior of estimators as $T\to\infty$.

Note that the conditions $\mathcal{ES}$ of the existence of solution and $\mathcal{RP}$ of the ergodicity are fulfilled (see [13], Sections 1.1 and 1.2) and the process $(X_t)_{t\ge 0}$ has ergodic properties with the invariant density

$$f(\vartheta,x) = p_1(x,\vartheta)\,e^{-\rho_1x^2/\sigma^2} + p_2(x,\vartheta)\,e^{-\rho_2x^2/\sigma^2}.$$

Here $p_1(x,\vartheta)=G(\vartheta)\,\mathbb{1}_{\{x<\vartheta\}}$, $p_2(x,\vartheta)=G(\vartheta)\,e^{(\rho_2-\rho_1)\vartheta^2/\sigma^2}\,\mathbb{1}_{\{x\ge\vartheta\}}$ and $G(\vartheta)$ is the normalizing constant. To simplify the exposition we suppose that the random variable $X_0$ has the density function $f(\vartheta,x)$, hence the observed process is stationary.

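We are interested in the asymptotic behavior of the maximum likelihood and Bayesian estimators of the parameter $\vartheta$, therefore we need the likelihood ratio function $L(\vartheta,X^T)$. This function can be written as (see [16])

$$\ln L(\vartheta,X^T) = -\frac{\rho_1}{\sigma^2}\int_0^T X_t\mathbb{1}_{\{X_t<\vartheta\}}\,dX_t - \frac{\rho_2}{\sigma^2}\int_0^T X_t\mathbb{1}_{\{X_t\ge\vartheta\}}\,dX_t - \frac{\rho_1^2}{2\sigma^2}\int_0^T X_t^2\mathbb{1}_{\{X_t<\vartheta\}}\,dt - \frac{\rho_2^2}{2\sigma^2}\int_0^T X_t^2\mathbb{1}_{\{X_t\ge\vartheta\}}\,dt + \ln f(\vartheta,X_0).$$

The contribution of the term $\ln f(\vartheta,X_0)$ is asymptotically negligible and we will always omit it for simplicity of exposition (see the details in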

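The MLE $\hat\vartheta_T$ and BE (for quadratic loss function) $\tilde\vartheta_T$ are defined as usual by the relations

$$L(\hat\vartheta_T,X^T) = \sup_{\theta\in\Theta} L(\theta,X^T) \quad\text{and}\quad \tilde\vartheta_T = \frac{\int_\alpha^\beta \theta\,p(\theta)\,L(\theta,X^T)\,d\theta}{\int_\alpha^\beta p(\theta)\,L(\theta,X^T)\,d\theta}. \qquad (4)$$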
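
For discretely sampled data the two estimators can be approximated on a grid of candidate thresholds. The sketch below is illustrative only and rests on assumptions not made in the paper: the stochastic and ordinary integrals are replaced by Itô-type sums over the sampling grid, the term $\ln f(\vartheta,X_0)$ is dropped, the prior is taken uniform on the grid, and the names `log_likelihood` and `mle_and_bayes` are hypothetical.

```python
import math


def log_likelihood(path, delta, theta, rho1, rho2, sigma):
    """Discretized ln L(theta, X^T) for the TOU model, without the ln f(theta, X_0) term."""
    total = 0.0
    for x, x_next in zip(path, path[1:]):
        rho = rho1 if x < theta else rho2
        total -= rho * x * (x_next - x) / sigma**2          # stochastic integral part
        total -= 0.5 * (rho * x) ** 2 * delta / sigma**2    # ordinary integral part
    return total


def mle_and_bayes(path, delta, grid, rho1, rho2, sigma):
    """Grid maximizer of L and posterior mean under a uniform prior on the grid."""
    logs = [log_likelihood(path, delta, th, rho1, rho2, sigma) for th in grid]
    top = max(logs)
    mle = grid[logs.index(top)]                    # first maximizer if there are ties
    weights = [math.exp(v - top) for v in logs]    # subtract the maximum for stability
    bayes = sum(w * th for w, th in zip(weights, grid)) / sum(weights)
    return mle, bayes
```

Because the likelihood is piecewise constant between consecutive observed values of the path, the grid maximizer is not unique in general; returning the first maximizer is a convention of this sketch only.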



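To describe their properties we need the following notation. Let us introduce

• the random process

$$Z_0(u) = \exp\Big\{W(u) - \frac{|u|}{2}\Big\}, \qquad u\in\mathbb{R},$$

where $W(\cdot)$ is a two-sided Wiener process,

• two random variables $\hat u$ and $\tilde u$ defined by the relations

$$Z_0(\hat u) = \sup_u Z_0(u), \qquad \tilde u = \frac{\int u\,Z_0(u)\,du}{\int Z_0(u)\,du}, \qquad (5)$$

• the function

$$\Gamma_\vartheta^2 = \frac{(\rho_2-\rho_1)^2\,\vartheta^2 f(\vartheta,\vartheta)}{\sigma^2}.$$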

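The properties of estimators are given in the following proposition.

Proposition 1 Let the condition A be fulfilled, then the MLE $\hat\vartheta_T$ and the BE $\tilde\vartheta_T$ are uniformly on compacts $\mathbb{K}\subset\Theta$ consistent: for any $\nu>0$

$$\sup_{\vartheta\in\mathbb{K}} P_\vartheta\{|\hat\vartheta_T-\vartheta|>\nu\}\to 0, \qquad \sup_{\vartheta\in\mathbb{K}} P_\vartheta\{|\tilde\vartheta_T-\vartheta|>\nu\}\to 0,$$

have two different limit distributions

$$T(\hat\vartheta_T-\vartheta) \Longrightarrow \frac{\hat u}{\Gamma_\vartheta^2}, \qquad T(\tilde\vartheta_T-\vartheta) \Longrightarrow \frac{\tilde u}{\Gamma_\vartheta^2},$$

and their moments converge: for any $p>0$

$$\lim_{T\to\infty} T^p\,E_\vartheta|\hat\vartheta_T-\vartheta|^p = \frac{E|\hat u|^p}{\Gamma_\vartheta^{2p}}, \qquad \lim_{T\to\infty} T^p\,E_\vartheta|\tilde\vartheta_T-\vartheta|^p = \frac{E|\tilde u|^p}{\Gamma_\vartheta^{2p}}.$$

For the proof see Section 6.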

Note that the same normalization and the same type of limits (with different $\Gamma_\vartheta$) arise in the problem of delay estimation from observations of the following Gaussian process

$$dX_t = -\rho\,X_{t-\vartheta}\,dt + \sigma\,dW_t, \qquad 0\le t\le T.$$

See details in (or in P^, Section 3.3).



Recall that the Bayesian estimators are usually asymptotically efficient in singular parameter estimation problems [7]. The following lower bound is valid: for all estimators $\bar\vartheta_T$

$$\lim_{\delta\to 0}\,\lim_{T\to\infty}\,\sup_{|\vartheta-\vartheta_0|<\delta} T^2\,E_\vartheta(\bar\vartheta_T-\vartheta)^2 \;\ge\; \frac{E\tilde u^2}{\Gamma(\vartheta_0)^4},$$

see [7], Section 1.9 (or [13], Proposition 2.24). We call an estimator $\vartheta_T^*$ asymptotically efficient if for all $\vartheta_0\in\Theta$ we have the equality

$$\lim_{\delta\to 0}\,\lim_{T\to\infty}\,\sup_{|\vartheta-\vartheta_0|<\delta} T^2\,E_\vartheta(\vartheta_T^*-\vartheta)^2 \;=\; \frac{E\tilde u^2}{\Gamma(\vartheta_0)^4}.$$

It can be verified that the convergence of the moments of Bayesian estimators is uniform on compacts in $\Theta$ and that the function $\Gamma(\vartheta)$ is continuous. From these properties we immediately obtain the asymptotic efficiency of the Bayesian estimators (in the sense of this lower bound).

The quantities $E\hat u^2$ and $E\tilde u^2$ were calculated by Terent'ev (1968) and Rubin and Song (1995) respectively:

$$E\hat u^2 = 26 \;>\; E\tilde u^2 = 16\,\zeta(3) \approx 19.2,$$

where $\zeta(\cdot)$ is the Riemann zeta function. This relation shows the difference between the limit variances of the MLE and BE.
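
The size of this gap is easy to check numerically. The snippet below is a small arithmetic illustration, not part of the original argument; the helper name `zeta3` and the truncation level are hypothetical choices, and $\zeta(3)$ is approximated by a partial sum of $\sum_k k^{-3}$.

```python
def zeta3(terms=100_000):
    """Partial sum of the series for zeta(3); the neglected tail is below 1/(2*terms**2)."""
    return sum(1.0 / k**3 for k in range(1, terms + 1))


mle_second_moment = 26.0                    # E u_hat^2 as quoted in the text
bayes_second_moment = 16.0 * zeta3()        # E u_tilde^2 = 16 * zeta(3)
ratio = mle_second_moment / bayes_second_moment
print(bayes_second_moment, ratio)
```

The ratio of the two limit second moments comes out at roughly 1.35, so in this asymptotic sense the MLE has about 35% larger mean squared error than the BE.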

2.2 All parameters unknown

It is possible to describe the properties of estimators in the case when all three parameters $(\rho_1,\rho_2,\vartheta)=(\vartheta_1,\vartheta_2,\vartheta_3)=\vartheta\in\Theta$ are unknown and we observe

$$dX_t = -\vartheta_1X_t\mathbb{1}_{\{X_t<\vartheta_3\}}\,dt - \vartheta_2X_t\mathbb{1}_{\{X_t\ge\vartheta_3\}}\,dt + \sigma\,dW_t, \qquad 0\le t\le T. \qquad (6)$$

We have $\Theta=(\alpha_1,\beta_1)\times(\alpha_2,\beta_2)\times(\alpha_3,\beta_3)$. Let us denote by $\xi$ the random variable with the density $f(\vartheta,x)$.

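Proposition 2 Suppose that $\beta_1<\alpha_2$ and $\alpha_2>0$, then the MLE $\hat\vartheta_T$ and the BE $\tilde\vartheta_T$ are consistent, have the following limit distributions

$$\sqrt{T}\,(\hat\vartheta_{1,T}-\vartheta_1)\Longrightarrow\zeta_1, \qquad \sqrt{T}\,(\hat\vartheta_{2,T}-\vartheta_2)\Longrightarrow\zeta_2,$$

$$T\,(\hat\vartheta_{3,T}-\vartheta_3)\Longrightarrow\frac{\hat u}{\Gamma_\vartheta^2}, \qquad T\,(\tilde\vartheta_{3,T}-\vartheta_3)\Longrightarrow\frac{\tilde u}{\Gamma_\vartheta^2}.$$

The BE $\tilde\vartheta_{1,T},\tilde\vartheta_{2,T}$ have the same asymptotic properties as $\hat\vartheta_{1,T},\hat\vartheta_{2,T}$, the random variables $\zeta_1$ and $\zeta_2$ are independent and are independent of $\hat u,\tilde u$.

For the proof see Section 6.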

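The construction of the MLE can be slightly simplified by the following "separation".

The MLE of the first two components can be written as

$$\hat\vartheta_{1,T} = -\frac{\int_0^T X_t\mathbb{1}_{\{X_t<\hat\vartheta_{3,T}\}}\,dX_t}{\int_0^T X_t^2\mathbb{1}_{\{X_t<\hat\vartheta_{3,T}\}}\,dt}, \qquad \hat\vartheta_{2,T} = -\frac{\int_0^T X_t\mathbb{1}_{\{X_t\ge\hat\vartheta_{3,T}\}}\,dX_t}{\int_0^T X_t^2\mathbb{1}_{\{X_t\ge\hat\vartheta_{3,T}\}}\,dt},$$

but the study of these expressions can be quite difficult because the estimator $\hat\vartheta_{3,T}$ depends on the whole trajectory and therefore the random function $X_t\mathbb{1}_{\{X_t<\hat\vartheta_{3,T}\}}$, $0\le t\le T$, depends on the "future". Hence the stochastic integral needs a special treatment. The problem can be simplified as follows. Let us estimate the parameter $\vartheta_3$ from the first observations $X^{\sqrt{T}}=(X_t,\,0\le t\le\sqrt{T})$ and denote by $\bar\vartheta_{3,\sqrt{T}}$ the corresponding consistent estimator. We suppose that there exists $\delta>0$ such that

$$P_\vartheta\big\{|\bar\vartheta_{3,\sqrt{T}}-\vartheta_3|>T^{-\delta}\big\}\to 0 \qquad (7)$$

as $T\to\infty$. Then we define the estimators

$$\vartheta^*_{1,T} = -\frac{\int_{\sqrt{T}}^T X_t\mathbb{1}_{\{X_t<\bar\vartheta_{3,\sqrt{T}}\}}\,dX_t}{\int_{\sqrt{T}}^T X_t^2\mathbb{1}_{\{X_t<\bar\vartheta_{3,\sqrt{T}}\}}\,dt}, \qquad \vartheta^*_{2,T} = -\frac{\int_{\sqrt{T}}^T X_t\mathbb{1}_{\{X_t\ge\bar\vartheta_{3,\sqrt{T}}\}}\,dX_t}{\int_{\sqrt{T}}^T X_t^2\mathbb{1}_{\{X_t\ge\bar\vartheta_{3,\sqrt{T}}\}}\,dt}. \qquad (8)$$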


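Now the stochastic integrals are well defined and the consistency and asymptotic normality of these estimators follow from the usual limit theorems, i.e., the law of large numbers for the ordinary integrals and the central limit theorem for the stochastic integrals.

Note that the independence of the random variables $\zeta_1$ and $\zeta_2$ follows from the following property of the stochastic integral: the two integrands contain the indicators of the disjoint sets $\{X_t<\vartheta_3\}$ and $\{X_t\ge\vartheta_3\}$, hence

$$E_\vartheta\int_0^T X_t\mathbb{1}_{\{X_t<\vartheta_3\}}\,dW_t\int_0^T X_t\mathbb{1}_{\{X_t\ge\vartheta_3\}}\,dW_t = E_\vartheta\int_0^T X_t^2\,\mathbb{1}_{\{X_t<\vartheta_3\}}\mathbb{1}_{\{X_t\ge\vartheta_3\}}\,dt = 0.$$

The possibility of simplifying the estimation of $\vartheta_3$ is discussed at the end of the next section.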



2.3 Misspecification 

Let us return to the initial problem of threshold estimation and suppose that the observed process is

$$dX_t = -\rho_1X_t\mathbb{1}_{\{X_t<\vartheta_0\}}\,dt - \rho_2X_t\mathbb{1}_{\{X_t\ge\vartheta_0\}}\,dt + h(X_t)\,dt + \sigma\,dW_t, \qquad (10)$$

where $h(\cdot)$ is some unknown function (contamination) and $\vartheta_0$ is the true value. We assume that the statistician uses this model without $h(\cdot)$ (wrong model) and tries to estimate $\vartheta$, i.e., he (or she) supposes that the observed process is TOU (3) and constructs, say, the MLE $\hat\vartheta_T$ as if $h(\cdot)\equiv 0$. Then he substitutes the observations (10) (which, of course, contain $h(\cdot)$). Such a situation can be considered typical for many applied problems, where there is a difference between the theoretical model and the real data. Recall that in the regular case the MLE and BE are usually not consistent and converge to the value which minimizes the Kullback-Leibler distance (see [13], Section 2.6.1). The Kullback-Leibler distance in our problem is (suppose for instance that $\vartheta_0<\vartheta$)

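$$D_{K\text{-}L}(\vartheta,\vartheta_0) = E^*_{\vartheta_0}\ln\frac{dP^*_{\vartheta_0}}{dP_\vartheta}(X^T) = \frac{T}{2\sigma^2}\,E^*_{\vartheta_0}\Big[(\rho_1-\rho_2)\,\xi\,\mathbb{1}_{\{\vartheta_0<\xi<\vartheta\}} + h(\xi)\Big]^2,$$

where $E^*_{\vartheta_0}$ denotes the expectation w.r.t. the measure $P^*_{\vartheta_0}$ which corresponds to the process (10) (we denote its density as $f_h(\vartheta_0,x)$). It can be shown (see [13], Section 2.6.1) that

$$\hat\vartheta_T \longrightarrow \vartheta_* = \arg\inf_{\vartheta\in\Theta} D_{K\text{-}L}(\vartheta,\vartheta_0).$$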

We are interested in the following question: when do we have $\vartheta_*=\vartheta_0$, i.e., when is the MLE nevertheless consistent? Surprisingly, this is possible even for functions $h(\cdot)$ that are not too small. Suppose, for simplicity, that $\vartheta\in\Theta=(\alpha,\beta)$, $\alpha>0$. Let us introduce the function



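$$K(\vartheta,\vartheta_0) = \begin{cases} E^*_{\vartheta_0}\big[(\rho_1-\rho_2)\,\xi\,\mathbb{1}_{\{\vartheta_0<\xi<\vartheta\}} + h(\xi)\big]^2, & \text{if } \vartheta>\vartheta_0,\\[4pt] E^*_{\vartheta_0}\big[(\rho_2-\rho_1)\,\xi\,\mathbb{1}_{\{\vartheta<\xi<\vartheta_0\}} + h(\xi)\big]^2, & \text{if } \vartheta<\vartheta_0, \end{cases}$$

and suppose that $\rho_2>\rho_1$. Then for $\vartheta>\vartheta_0$ we have

$$K(\vartheta,\vartheta_0) = \int_{-\infty}^{\vartheta_0} h(x)^2 f_h(\vartheta_0,x)\,dx + \int_{\vartheta}^{\infty} h(x)^2 f_h(\vartheta_0,x)\,dx + \int_{\vartheta_0}^{\vartheta}\big[(\rho_1-\rho_2)\,x + h(x)\big]^2 f_h(\vartheta_0,x)\,dx,$$

and

$$\frac{\partial K(\vartheta,\vartheta_0)}{\partial\vartheta} = -h(\vartheta)^2 f_h(\vartheta_0,\vartheta) + \big[(\rho_1-\rho_2)\,\vartheta + h(\vartheta)\big]^2 f_h(\vartheta_0,\vartheta) = \big[(\rho_1-\rho_2)^2\vartheta^2 + 2(\rho_1-\rho_2)\,\vartheta\,h(\vartheta)\big] f_h(\vartheta_0,\vartheta).$$

Therefore, if

$$h(y) < \frac{y}{2}\,(\rho_2-\rho_1), \qquad \text{for } \alpha<y<\beta,$$

then for $\vartheta>\vartheta_0$

$$\frac{\partial K(\vartheta,\vartheta_0)}{\partial\vartheta} > 0,$$

and similarly, if

$$h(y) > -\frac{y}{2}\,(\rho_2-\rho_1), \qquad \text{for } \alpha<y<\beta,$$

then for $\vartheta<\vartheta_0$

$$\frac{\partial K(\vartheta,\vartheta_0)}{\partial\vartheta} < 0.$$

We see that if the function $h(\cdot)$ satisfies the condition

$$|h(y)| < \frac{y}{2}\,(\rho_2-\rho_1), \qquad \alpha<y<\beta, \qquad (11)$$

then $\vartheta_*=\vartheta_0$ and the MLE $\hat\vartheta_T$ is consistent even for this "wrong model" (see [13], Section 3.4.5 for another example). Note that there are no conditions on $h(y)$ for $y\notin[\alpha,\beta]$.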
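
The role of condition (11) can be illustrated numerically. The sketch below is an assumption-laden illustration rather than the paper's construction: the contamination `h`, the parameter values and the grid are hypothetical, and, since the density $f_h(\vartheta_0,x)$ of the contaminated process is not available in closed form here, it is replaced by an arbitrary positive weight function (only positivity matters for the sign argument above). The code evaluates $K(\vartheta,\vartheta_0)-K(\vartheta_0,\vartheta_0)$ by the midpoint rule and locates its minimizer on a grid.

```python
import math


def excess_risk(theta, theta0, rho1, rho2, h, weight, steps=2000):
    """K(theta, theta0) - K(theta0, theta0), computed by the midpoint rule."""
    lo, hi = min(theta, theta0), max(theta, theta0)
    if hi == lo:
        return 0.0
    # Jump of the trend on the interval between theta0 and theta.
    jump = (rho1 - rho2) if theta > theta0 else (rho2 - rho1)
    width = (hi - lo) / steps
    total = 0.0
    for i in range(steps):
        x = lo + (i + 0.5) * width
        total += ((jump * x + h(x)) ** 2 - h(x) ** 2) * weight(x) * width
    return total


rho1, rho2, theta0 = 1.0, 3.0, 1.0
h = lambda x: 0.5 * math.sin(x)            # satisfies |h(y)| < y*(rho2 - rho1)/2 on (0.5, 2)
weight = lambda x: math.exp(-x * x / 2.0)  # stand-in for f_h(theta0, x), NOT the true density
grid = [k / 20 for k in range(10, 41)]     # candidate thresholds 0.5, 0.55, ..., 2.0
values = [excess_risk(t, theta0, rho1, rho2, h, weight) for t in grid]
minimizer = grid[values.index(min(values))]
```

Under these assumptions the excess risk is nonnegative on the whole grid and vanishes only at the true threshold, in agreement with the conclusion $\vartheta_*=\vartheta_0$.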

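Let us return to the problem of the construction of the preliminary consistent estimator of the parameter $\vartheta_3$ from the observations $X^{\sqrt{T}}$. Suppose that $\beta_1-\alpha_1<\alpha_2-\beta_1$ and $\beta_2-\alpha_2<\alpha_2-\beta_1$. Let us put

$$\bar\vartheta_1 = \frac{\alpha_1+\beta_1}{2}, \qquad \bar\vartheta_2 = \frac{\alpha_2+\beta_2}{2}$$

and consider the problem of estimation of $\vartheta_3$ by the "wrong model"

$$dX_t = -\bar\vartheta_1X_t\mathbb{1}_{\{X_t<\vartheta_3\}}\,dt - \bar\vartheta_2X_t\mathbb{1}_{\{X_t\ge\vartheta_3\}}\,dt + \sigma\,dW_t, \qquad 0\le t\le\sqrt{T},$$

with "known" $\bar\vartheta_1,\bar\vartheta_2$. This corresponds well to the model (10) with

$$h(x) = (\bar\vartheta_1-\vartheta_1)\,x\,\mathbb{1}_{\{x<\vartheta_3\}} + (\bar\vartheta_2-\vartheta_2)\,x\,\mathbb{1}_{\{x\ge\vartheta_3\}}.$$

We see that the condition (11) is fulfilled, hence the corresponding MLE $\bar\vartheta_{3,\sqrt{T}}$ is consistent and can be used in the construction of the estimators (8). Note that the estimator $\bar\vartheta_{3,\sqrt{T}}$ even has the "singular" rate of convergence, but its limit distribution is different from that of the true MLE.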

3 Other Threshold Models.



Below we consider several other threshold type ergodic diffusion processes 
and discuss the properties of parameter estimators for these models. 



3.1 Simple Threshold model. 

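Suppose that the observed process is

$$dX_t = \rho_1\mathbb{1}_{\{X_t<\vartheta\}}\,dt - \rho_2\mathbb{1}_{\{X_t\ge\vartheta\}}\,dt + \sigma\,dW_t, \qquad 0\le t\le T, \qquad (12)$$

where $\rho_i>0$ and $\vartheta\in\Theta$. Then this process is ergodic with exponential type invariant density

$$f(\vartheta,x) = G\,\exp\Big\{-\frac{2}{\sigma^2}\,\rho(x,\vartheta)\,|x-\vartheta|\Big\},$$

where $\rho(x,\vartheta)=\rho_1\mathbb{1}_{\{x<\vartheta\}}+\rho_2\mathbb{1}_{\{x\ge\vartheta\}}$ and $G$ is the normalizing constant. The MLE $\hat\vartheta_T$ and BE $\tilde\vartheta_T$ have the same properties as in Theorem 1 with the corresponding function

$$\Gamma_\vartheta^2 = \frac{2\rho_2\rho_1(\rho_2+\rho_1)}{\sigma^4}.$$

Note that the normalized LR converges to the limit process as follows:

$$Z_T(u) = \frac{L(\vartheta+\frac{u}{T},X^T)}{L(\vartheta,X^T)} \Longrightarrow \exp\Big\{\Gamma_\vartheta W(u) - \frac{|u|}{2}\,\Gamma_\vartheta^2\Big\}.$$

For the proof see Section 6.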
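
As a sanity check on the invariant density of the simple threshold model (12), one can evaluate the normalizing integral explicitly, which gives $G = 2\rho_1\rho_2/(\sigma^2(\rho_1+\rho_2))$, and combine it with the size $\rho_1+\rho_2$ of the jump of the trend at the threshold. The sketch below verifies both facts numerically; the function names and the numerical values are hypothetical, and the closed form of $G$ is our own evaluation of the integral, stated here as an assumption of the example.

```python
import math


def normalizing_constant(rho1, rho2, sigma):
    # 1/G = sigma^2/(2*rho1) + sigma^2/(2*rho2): the two exponential tails
    return 2.0 * rho1 * rho2 / (sigma**2 * (rho1 + rho2))


def invariant_density(x, theta, rho1, rho2, sigma):
    rho = rho1 if x < theta else rho2
    G = normalizing_constant(rho1, rho2, sigma)
    return G * math.exp(-2.0 * rho * abs(x - theta) / sigma**2)


def gamma_squared(rho1, rho2, sigma):
    # (jump of the trend at theta)^2 * f(theta, theta) / sigma^2
    jump = rho1 + rho2
    return jump**2 * normalizing_constant(rho1, rho2, sigma) / sigma**2


def total_mass(theta, rho1, rho2, sigma, half_width=30.0, steps=200_000):
    """Midpoint-rule integral of the density over [theta - half_width, theta + half_width]."""
    width = 2.0 * half_width / steps
    start = theta - half_width
    return sum(
        invariant_density(start + (i + 0.5) * width, theta, rho1, rho2, sigma) * width
        for i in range(steps)
    )
```

For instance, `total_mass(0.7, 1.0, 2.0, 1.5)` should be numerically indistinguishable from 1, and `gamma_squared(1.0, 2.0, 1.5)` should agree with $2\rho_2\rho_1(\rho_2+\rho_1)/\sigma^4$.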



3.2 Simple Switching. 

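Suppose that in the model (12) we have $\rho_1=\rho_2=\rho>0$. Then the observed process is

$$dX_t = -\rho\,\mathrm{sgn}(X_t-\vartheta)\,dt + \sigma\,dW_t, \qquad 0\le t\le T, \qquad (13)$$

where $\vartheta\in\Theta=(\alpha,\beta)$. This Simple Switching Process was studied in [13], Section 3.4.1. Recall that it has a Laplace type invariant density

$$f(\vartheta,x) = \frac{\rho}{\sigma^2}\,e^{-\frac{2\rho}{\sigma^2}|x-\vartheta|}.$$

The likelihood ratio has the representation

$$L(\vartheta,X^T) = \exp\Big\{-\frac{\rho}{\sigma^2}\int_0^T\mathrm{sgn}(X_t-\vartheta)\,dX_t - \frac{\rho^2}{2\sigma^2}\,T\Big\}.$$

Hence, the MLE $\hat\vartheta_T$ is defined by the equation

$$\int_0^T\mathrm{sgn}(X_t-\hat\vartheta_T)\,dX_t = \inf_{\vartheta\in\Theta}\int_0^T\mathrm{sgn}(X_t-\vartheta)\,dX_t.$$

Note that the last stochastic integral appears in the Tanaka-Meyer representation of the local time of the diffusion process (see jl^ )

$$\Lambda_T(\vartheta) = |X_T-\vartheta| - |X_0-\vartheta| - \int_0^T\mathrm{sgn}(X_t-\vartheta)\,dX_t$$

and the maximum likelihood estimator is in some sense asymptotically equivalent to the maximum local time estimator. Recall that $f^\circ_T(x)=\Lambda_T(x)/(T\sigma^2)$ is a consistent, asymptotically normal and asymptotically efficient (in the nonparametric statement) estimator of the invariant density (see pjij for details), and we have obviously

$$\sup_{\vartheta} f(\vartheta_0,\vartheta) = f(\vartheta_0,\vartheta_0).$$

We have the same asymptotic properties of the MLE and BE as in Theorem 1.

The normalized LR

$$Z_T(u) = \frac{L(\vartheta+\frac{u}{T},X^T)}{L(\vartheta,X^T)} \Longrightarrow \exp\Big\{\Gamma_\vartheta W(u) - \frac{|u|}{2}\,\Gamma_\vartheta^2\Big\}, \qquad \Gamma_\vartheta^2 = \frac{4\rho^3}{\sigma^4}.$$

The proof can be found in [13], Section 3.4.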

The observation window (— oo, oo) can be essentially reduced. Let us put 



Note that i)*^ is an estimator of the method of moments (E^^ = i)). It is 
consistent and asymptotically normal 



(^^^^_^^) ^Ar(o,rf2(^)), 
see [13], p. 270, where $d(\vartheta)^2$ is calculated. Introduce the window



El 



The MLE and BE we define with the help of the LR L (^^, X 



T 

VT 



exp 



VT 



2a2 



VT 



Then these estimators have the same asymptotic properties as if the observation window were $\mathbb{E}_T=(-\infty,\infty)$.

This somewhat surprising result is probably typical for singular estimation problems. The analysis of the proof of the properties of estimators (see [13], Section 3.4) shows that only the values of $X_t$ close to the true value $\vartheta_0$ contribute to the limit likelihood ratio. Hence all other observations are irrelevant and can be deleted by introducing this window.



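3.3 Multi Threshold O-U Process.

Suppose that the observed process is

$$dX_t = -\sum_{l=1}^{k+1}\rho_l\,X_t\,\mathbb{1}_{\{\vartheta_{l-1}\le X_t<\vartheta_l\}}\,dt + \sigma\,dW_t, \qquad 0\le t\le T, \qquad (14)$$

where $\rho_1>0$, $\rho_{k+1}>0$, $\rho_l\ne\rho_{l+1}$, $\vartheta_0=-\infty$, $\vartheta_{k+1}=\infty$ and $\vartheta=(\vartheta_1,\ldots,\vartheta_k)\in\Theta=\Theta_1\times\ldots\times\Theta_k$, $\Theta_l=(\alpha_l,\beta_l)$, $\beta_l<\alpha_{l+1}$. Then this process is ergodic and the normalized likelihood ratio ($u=(u_1,\ldots,u_k)$) has the following limit

$$Z_T(u) = \frac{L(\vartheta+\frac{u}{T},X^T)}{L(\vartheta,X^T)} \Longrightarrow Z(u) = \prod_{l=1}^{k}\exp\Big\{\Gamma_lW_l(u_l) - \frac{|u_l|}{2}\,\Gamma_l^2\Big\},$$

where $W_l(\cdot)$ are independent two-sided Wiener processes. The estimators $\hat\vartheta_T=(\hat\vartheta_{1,T},\ldots,\hat\vartheta_{k,T})$ and $\tilde\vartheta_T=(\tilde\vartheta_{1,T},\ldots,\tilde\vartheta_{k,T})$ are consistent and have asymptotically independent components,

$$T(\hat\vartheta_{l,T}-\vartheta_l)\Longrightarrow\frac{\hat u_l}{\Gamma_l^2}, \qquad T(\tilde\vartheta_{l,T}-\vartheta_l)\Longrightarrow\frac{\tilde u_l}{\Gamma_l^2},$$

i.e., $(\hat u_l,\tilde u_l)$ is independent of $(\hat u_m,\tilde u_m)$ if $l\ne m$, and the moments converge. For the proof see Section 6.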

4 General Threshold Model.

Suppose that the observed diffusion process $X^T=\{X_t,\,0\le t\le T\}$ satisfies the equation

$$dX_t = \sum_{j=1}^{k+1}S_j(X_t)\,\mathbb{1}_{\{\vartheta_{j-1}\le X_t<\vartheta_j\}}\,dt + \sigma(X_t)\,dW_t, \quad X_0, \qquad (15)$$

where $\vartheta_0=-\infty$, $\vartheta_j\in\Theta_j=(\alpha_j,\beta_j)$, $j=1,\ldots,k$, $\vartheta_{k+1}=\infty$, $\beta_j<\alpha_{j+1}$. The unknown parameter is $\vartheta=(\vartheta_1,\ldots,\vartheta_k)\in\Theta=\Theta_1\times\ldots\times\Theta_k$. Our goal is to estimate $\vartheta$ and to describe the asymptotic properties of estimators as $T\to\infty$. As before, we are interested in the estimators obtained by the maximum likelihood and Bayesian methods.

This model can be called a "Nonlinear Threshold Diffusion Process". Of course, all the models considered above are nonlinear due to the indicator functions. Here we use the term "nonlinear" because the linear function $\rho x$ in the trend coefficient $-\rho x\,\mathbb{1}_{\{\cdot\}}$ is replaced by a more general function $S(x)$.

$\mathcal{ES}$. The functions $S_j(\cdot)$ are locally bounded, the function $\sigma(\cdot)^2$ is continuous and positive and for some $A>0$ the condition

$$x\,S_1(x)\,\mathbb{1}_{\{x<\alpha_1\}} + x\,S_{k+1}(x)\,\mathbb{1}_{\{x>\beta_k\}} + \sigma(x)^2 \le A\,(1+x^2) \qquad (16)$$

holds.

This condition provides the existence of unique weak solution (see j^). 
We suppose that all measures $\{P_\vartheta^{(T)},\,\vartheta\in\Theta\}$ induced by this process in the space $(\mathcal{C}(0,T),\mathcal{B}(0,T))$ are equivalent to the measure $P^{(T)}$, which corresponds to the process

$$dX_t = \sigma(X_t)\,dW_t, \quad X_0, \quad 0\le t\le T$$

(see [IS]). The likelihood ratio 

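in this problem is the random function

$$\ln L(\vartheta,X^T) = \sum_{j=1}^{k+1}\int_0^T\frac{S_j(X_t)}{\sigma(X_t)^2}\,\mathbb{1}_{\{\vartheta_{j-1}\le X_t<\vartheta_j\}}\,dX_t - \frac12\sum_{j=1}^{k+1}\int_0^T\frac{S_j(X_t)^2}{\sigma(X_t)^2}\,\mathbb{1}_{\{\vartheta_{j-1}\le X_t<\vartheta_j\}}\,dt.$$

The MLE $\hat\vartheta_T$ is defined by the same equation

$$L(\hat\vartheta_T,X^T) = \sup_{\theta\in\Theta}L(\theta,X^T),$$

where the function $L(\vartheta,X^T)$ is not differentiable with respect to $\vartheta$.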
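Note that

$$\mathbb{1}_{\{\vartheta_{j-1}\le x<\vartheta_j\}} = \mathbb{1}_{\{x<\vartheta_j\}} - \mathbb{1}_{\{x<\vartheta_{j-1}\}}.$$

Hence

$$\sum_{j=1}^{k+1}S_j(x)\,\mathbb{1}_{\{\vartheta_{j-1}\le x<\vartheta_j\}} = \sum_{j=1}^{k+1}S_j(x)\,\mathbb{1}_{\{x<\vartheta_j\}} - \sum_{j=1}^{k+1}S_j(x)\,\mathbb{1}_{\{x<\vartheta_{j-1}\}} = S_{k+1}(x) + \sum_{j=1}^{k}\big[S_j(x)-S_{j+1}(x)\big]\,\mathbb{1}_{\{x<\vartheta_j\}}$$

and we can write the likelihood ratio as a product of $k+1$ "likelihood ratios"

$$L(\vartheta,X^T) = L_{k+1}(X^T)\prod_{j=1}^{k}L_j(\vartheta_j,X^T), \qquad (17)$$

where

$$\ln L_{k+1}(X^T) = \int_0^T\frac{S_{k+1}(X_t)}{\sigma(X_t)^2}\,dX_t - \frac12\int_0^T\frac{S_{k+1}(X_t)^2}{\sigma(X_t)^2}\,dt$$

and

$$\ln L_j(\vartheta_j,X^T) = \int_0^T\frac{S_j(X_t)-S_{j+1}(X_t)}{\sigma(X_t)^2}\,\mathbb{1}_{\{X_t<\vartheta_j\}}\,dX_t - \frac12\int_0^T\frac{S_j(X_t)^2-S_{j+1}(X_t)^2}{\sigma(X_t)^2}\,\mathbb{1}_{\{X_t<\vartheta_j\}}\,dt.$$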

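This allows us to reduce the calculation of the MLE $\hat\vartheta_T$ of the multidimensional parameter $\vartheta$ to one-dimensional problems:

$$\hat\vartheta_{j,T} = \arg\max_{\theta_j\in\Theta_j}L_j(\theta_j,X^T), \qquad j=1,\ldots,k,$$

and to put $\hat\vartheta_T=(\hat\vartheta_{1,T},\ldots,\hat\vartheta_{k,T})$.

To introduce the Bayesian estimator we suppose that $\vartheta$ is a random vector with a known continuous positive prior density $p(\theta)$, $\theta\in\Theta$, and that the loss function $\ell(u)$, $u\in\mathbb{R}^k$, is strictly convex. The estimator $\tilde\vartheta_T$ is defined as a solution of the following equation

$$\int_\Theta E_\theta\,\ell(\tilde\vartheta_T-\theta)\,p(\theta)\,d\theta = \inf_{\bar\vartheta_T}\int_\Theta E_\theta\,\ell(\bar\vartheta_T-\theta)\,p(\theta)\,d\theta.$$

Recall that in the case $\ell(u)=|u|^2$ this estimator is

$$\tilde\vartheta_T = \frac{\int_\Theta\theta\,L(\theta,X^T)\,p(\theta)\,d\theta}{\int_\Theta L(\theta,X^T)\,p(\theta)\,d\theta}.$$

In this case we can simplify the calculation of the estimator too. Suppose that the density $p(\theta)=p_1(\theta_1)\cdots p_k(\theta_k)$ (the components of $\vartheta$ are independent random variables). Then, using (17), we can write

$$\tilde\vartheta_{j,T} = \frac{\int_{\Theta_j}\theta_j\,L_j(\theta_j,X^T)\,p_j(\theta_j)\,d\theta_j}{\int_{\Theta_j}L_j(\theta_j,X^T)\,p_j(\theta_j)\,d\theta_j}, \qquad j=1,\ldots,k,$$

and then put $\tilde\vartheta_T=(\tilde\vartheta_{1,T},\ldots,\tilde\vartheta_{k,T})$.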

The asymptotic behavior of the diffusion process is defined by the following condition.

$\mathcal{A}$. The functions $S_1(x)$, $S_{k+1}(x)$ and $\sigma(x)$ satisfy the conditions

$$|\sigma(x)|^{-1} \le B\,(1+|x|^m)$$

with some $B>0$ and $m>0$ and

$$\lim_{x\to-\infty}\frac{S_1(x)}{\sigma(x)^2} > 0, \qquad \lim_{x\to\infty}\frac{S_{k+1}(x)}{\sigma(x)^2} < 0.$$

By this condition the process $(X_t)_{t\ge 0}$ has ergodic properties. Let us denote by $f(\vartheta,x)$ the density of its invariant law and by $\xi$ the random variable with this density function. Note that by this condition $\xi$ has all polynomial moments [13].

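The identifiability condition in this statistical problem is the following one:

$$\inf_{\alpha_j<y<\beta_j}|S_j(y)-S_{j+1}(y)| > 0, \qquad j=1,\ldots,k. \qquad (18)$$

Let us introduce $\hat u_\vartheta=(\hat u_{1,\vartheta},\ldots,\hat u_{k,\vartheta})$, where

$$\hat u_{j,\vartheta} = \frac{\hat u_j}{\gamma_j(\vartheta)^2}, \qquad \gamma_j(\vartheta)^2 = \frac{[S_j(\vartheta_j)-S_{j+1}(\vartheta_j)]^2}{\sigma(\vartheta_j)^2}\,f(\vartheta,\vartheta_j),$$

and $\hat u_1,\ldots,\hat u_k$ are independent random variables defined by the equalities

$$\hat u_j = \arg\sup_{u\in\mathbb{R}}\Big[W_j(u)-\frac12|u|\Big].$$

Here $W_j(\cdot)$, $j=1,\ldots,k$, are independent two-sided Wiener processes.

Let us define the random vector $\tilde u_\vartheta$ as a solution of the following equation

$$\int_{\mathbb{R}^k}\ell(\tilde u_\vartheta-u)\,Z(u)\,du = \inf_v\int_{\mathbb{R}^k}\ell(v-u)\,Z(u)\,du,$$

where

$$Z(u) = \exp\Big\{\sum_{j=1}^{k}\Big[\gamma_j(\vartheta)\,W_j(u_j) - \frac{|u_j|}{2}\,\gamma_j(\vartheta)^2\Big]\Big\}. \qquad (19)$$

Theorem 1 Suppose that the conditions $\mathcal{ES}$, $\mathcal{A}$ and (18) are fulfilled, then the MLE $\hat\vartheta_T$ and the Bayesian estimator $\tilde\vartheta_T$ are consistent, have the following limit distributions:

$$T(\hat\vartheta_T-\vartheta)\Longrightarrow\hat u_\vartheta, \qquad T(\tilde\vartheta_T-\vartheta)\Longrightarrow\tilde u_\vartheta,$$

and the moments converge: for any $p>0$

$$\lim_{T\to\infty}T^p\,E_\vartheta|\hat\vartheta_T-\vartheta|^p = E|\hat u_\vartheta|^p, \qquad \lim_{T\to\infty}T^p\,E_\vartheta|\tilde\vartheta_T-\vartheta|^p = E|\tilde u_\vartheta|^p.$$

The proof is given in Section 6.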



5 Proofs 

First note that the parameter estimation problems for the models of observations (3), (12) and (13) are particular cases of the threshold estimation problem for the stochastic process (15). Therefore, it is sufficient to prove Theorem 1.

The proof of this theorem is based on the two remarkable theorems by Ibragimov and Khasminskii ([7], Theorems 1.10.1 and 1.10.2) and some results obtained before in [13]. Let us recall the main steps of this approach. Introduce the random function (normalized likelihood ratio)

$$Z_T(u) = \frac{L(\vartheta+\frac{u}{T},X^T)}{L(\vartheta,X^T)}, \qquad u\in\mathbb{U}_T=\mathbb{U}_{1,T}\times\cdots\times\mathbb{U}_{k,T},$$

where $\mathbb{U}_{j,T}=(T(\alpha_j-\vartheta_j),\,T(\beta_j-\vartheta_j))$. The properties of estimators follow, roughly speaking, from the weak convergence of this function to the limit random field (19):

$$Z_T(u) = \frac{L(\vartheta+\frac{u}{T},X^T)}{L(\vartheta,X^T)} \Longrightarrow Z(u).$$

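Suppose that we already have this convergence and (for simplicity) assume that $k=1$. Then for the MLE we have ($\vartheta$ is the true value):

$$P_\vartheta\{T(\hat\vartheta_T-\vartheta)<x\} = P_\vartheta\Big\{\sup_{T(\theta-\vartheta)<x}L(\theta,X^T) > \sup_{T(\theta-\vartheta)\ge x}L(\theta,X^T)\Big\}$$

$$= P_\vartheta\Big\{\sup_{T(\theta-\vartheta)<x}\frac{L(\theta,X^T)}{L(\vartheta,X^T)} > \sup_{T(\theta-\vartheta)\ge x}\frac{L(\theta,X^T)}{L(\vartheta,X^T)}\Big\} = P_\vartheta\Big\{\sup_{u<x}Z_T(u) > \sup_{u\ge x}Z_T(u)\Big\}$$

$$\longrightarrow P\Big\{\sup_{u<x}Z(u) > \sup_{u\ge x}Z(u)\Big\} = P\{\hat u_\vartheta<x\}, \quad\text{i.e.}\quad T(\hat\vartheta_T-\vartheta)\Longrightarrow\hat u_\vartheta, \qquad (20)$$

where we put $\theta=\vartheta+T^{-1}u$.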

To describe the behavior of the BE we take for simplicity the quadratic loss function and use the same change of variables $\theta=\vartheta+u/T=\theta_u$:

$$\tilde\vartheta_T = \frac{\int_\Theta\theta\,p(\theta)\,L(\theta,X^T)\,d\theta}{\int_\Theta p(\theta)\,L(\theta,X^T)\,d\theta} = \vartheta + \frac1T\,\frac{\int_{\mathbb{U}_T}u\,p(\theta_u)\,L(\theta_u,X^T)\,du}{\int_{\mathbb{U}_T}p(\theta_u)\,L(\theta_u,X^T)\,du} = \vartheta + \frac1T\,\frac{\int_{\mathbb{U}_T}u\,p(\theta_u)\,Z_T(u)\,du}{\int_{\mathbb{U}_T}p(\theta_u)\,Z_T(u)\,du}.$$

Then, using the convergence $p(\theta_u)\to p(\vartheta)$, we can write

$$P_\vartheta\{T(\tilde\vartheta_T-\vartheta)<x\} = P_\vartheta\Big\{\frac{\int_{\mathbb{U}_T}u\,p(\theta_u)\,Z_T(u)\,du}{\int_{\mathbb{U}_T}p(\theta_u)\,Z_T(u)\,du}<x\Big\} \longrightarrow P\Big\{\frac{\int_{\mathbb{R}}u\,Z(u)\,du}{\int_{\mathbb{R}}Z(u)\,du}<x\Big\} = P\{\tilde u_\vartheta<x\}. \qquad (21)$$

The random variables $\hat u_\vartheta$ and $\tilde u_\vartheta$ are defined in Section 4.

We see that to prove the theorem we need to prove the convergences (20), (21). These convergences together with the estimates on the large deviations of estimators will provide the convergence of moments. The corresponding sufficient conditions are given in the above-mentioned theorems by Ibragimov and Khasminskii. Let us introduce the conditions

A. The finite dimensional distributions of the random function $Z_T(\cdot)$ converge to the finite dimensional distributions of the function $Z(\cdot)$.

B. There exist constants $B>0$, $m>0$, $b>0$ and $d$ such that for any $R>0$ and $|u|\le R$, $|v|\le R$

$$E_\vartheta\Big|Z_T^{\frac{1}{2m}}(u) - Z_T^{\frac{1}{2m}}(v)\Big|^{2m} \le B\,(1+R^b)\,|u-v|^d. \qquad (22)$$

C. For any $N>0$, there exists a constant $C_N>0$ such that

$$E_\vartheta Z_T^{\frac12}(u) \le \frac{C_N}{|u|^N}. \qquad (23)$$

These conditions are a version of the conditions of Theorems 1.10.1 (with $d>k$) and 1.10.2 [7], which we will verify in this work.

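We start with the condition A. Let us consider the case when all $u_j>0$ and denote $h_j(x)=S_j(x)/\sigma(x)$. Note that

$$\mathbb{1}_{\{X_t<\vartheta_j+\frac{u_j}{T}\}} - \mathbb{1}_{\{X_t<\vartheta_j\}} = \mathbb{1}_{\{\vartheta_j\le X_t<\vartheta_j+\frac{u_j}{T}\}} = \mathbb{1}_{\{\mathbb{B}_j\}}$$

in obvious notation.

Then the likelihood ratio $Z_T(u)$ can be written as follows

$$\ln Z_T(u) = \sum_{j=1}^{k}\int_0^T[h_j(X_t)-h_{j+1}(X_t)]\,\mathbb{1}_{\{\mathbb{B}_j\}}\,dW_t - \frac12\sum_{j=1}^{k}\int_0^T[h_j(X_t)-h_{j+1}(X_t)]^2\,\mathbb{1}_{\{\mathbb{B}_j\}}\,dt.$$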

^ i=l 



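Using the local time estimator $f^\circ_T(x)$ of the invariant density $f(\vartheta,x)$ we write

$$\int_0^T[h_j(X_t)-h_{j+1}(X_t)]^2\,\mathbb{1}_{\{\vartheta_j\le X_t<\vartheta_j+\frac{u_j}{T}\}}\,dt = T\int_{-\infty}^{\infty}[h_j(x)-h_{j+1}(x)]^2\,\mathbb{1}_{\{\vartheta_j\le x<\vartheta_j+\frac{u_j}{T}\}}\,f^\circ_T(x)\,dx$$

$$= T\int_{\vartheta_j}^{\vartheta_j+\frac{u_j}{T}}[h_j(x)-h_{j+1}(x)]^2\,f^\circ_T(x)\,dx$$

$$= T\int_{\vartheta_j}^{\vartheta_j+\frac{u_j}{T}}[h_j(x)-h_{j+1}(x)]^2\,f(\vartheta,x)\,dx + T\int_{\vartheta_j}^{\vartheta_j+\frac{u_j}{T}}[h_j(x)-h_{j+1}(x)]^2\,[f^\circ_T(x)-f(\vartheta,x)]\,dx.$$

For the random function $\eta_T(x)=\sqrt{T}\,(f^\circ_T(x)-f(\vartheta,x))$ we have the estimate: for any $p>0$ there exist constants $C_p>0$ and $c_*>0$ such that

$$E_\vartheta|\eta_T(x)|^p \le C_p\,e^{-c_*|x|}, \qquad (24)$$

see Proposition 1.11 in [13]. This estimate allows us to prove that the last integral tends to zero as $T\to\infty$. We have as well

$$T\int_{\vartheta_j}^{\vartheta_j+\frac{u_j}{T}}[h_j(x)-h_{j+1}(x)]^2\,f(\vartheta,x)\,dx \longrightarrow u_j\,\gamma_j(\vartheta)^2.$$

Therefore,

$$\sum_{j=1}^{k}\int_0^T[h_j(X_t)-h_{j+1}(X_t)]^2\,\mathbb{1}_{\{\mathbb{B}_j\}}\,dt \longrightarrow \sum_{j=1}^{k}u_j\,\gamma_j(\vartheta)^2.$$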

This convergence, by the central limit theorem for stochastic integrals, yields the asymptotic normality of the vector $\xi_T=(\xi_{1,T},\ldots,\xi_{k,T})$:

$$\xi_{j,T} = \int_0^T[h_j(X_t)-h_{j+1}(X_t)]\,\mathbb{1}_{\{\mathbb{B}_j\}}\,dW_t \Longrightarrow \mathcal{N}\big(0,\,u_j\,\gamma_j(\vartheta)^2\big)$$

with asymptotically independent components, because

$$E_\vartheta\,\xi_{j,T}\,\xi_{l,T} = 0, \qquad j\ne l.$$

Moreover, if we put $\xi_{j,T}=\xi_{j,T}(u_j)$ and consider the vector $(\xi_{j,T}(u_{j,1}),\ldots,\xi_{j,T}(u_{j,n}))$, where $u_{j,1},\ldots,u_{j,n}$ is some collection of values from $\mathbb{U}_{j,T}$, then

$$E_\vartheta\,\xi_{j,T}(u_{j,l})\,\xi_{j,T}(u_{j,m}) = T\int_{\vartheta_j}^{\vartheta_j+\frac{u_{j,l}\wedge u_{j,m}}{T}}[h_j(x)-h_{j+1}(x)]^2\,f(\vartheta,x)\,dx.$$

Using this equality and the preceding limits we can show the convergence

$$\big(\xi_{j,T}(u_{j,1}),\ldots,\xi_{j,T}(u_{j,n})\big) \Longrightarrow \gamma_j(\vartheta)\,\big(W_j(u_{j,1}),\ldots,W_j(u_{j,n})\big).$$

Therefore the condition A is fulfilled.

We verify B twice. The first time we check this condition with $m=1$, which is sufficient for Bayes estimators (multidimensional case), and then (for the MLE) we verify it for the partial likelihoods $Z_{j,T}(u)$. Following [13], Lemma 3.28, we write (we suppose that $v_j<u_j$)



ryl/2 j 

[U 



j=l -'o 



dt 



1 

4 ^ 

i=i 

k 



T 



"3+ T 



[/ij- (x) - hj+i {x)f (x) dx 



< C |uj — fjl < C \\u — v\ 



(25) 



Here E^, and /* (■) are expectation and invariant density which correspond 
to the stochastic differential equation 



fc+i 
i=i 

+ aiXt)dWt, Xo, 0<t<T 



dt 



(see details in [13], p. 379). The notation Ej is clear from the second line of 

In the study of the MLE we check the condition B for the components $Z_{j,T}(u_j)$, $u_j\in\mathbb{U}_{j,T}$, separately as follows. Let us introduce the stochastic process
and denote 



$$g_j(x) = \frac{S_j(x)-S_{j+1}(x)}{\sigma(x)}.$$



Then the process 



Vj^t = exp 



16 



- / 

32 70 



1 



{^j+^Jr<Xs<i)j + ^} 



ds 



by Ito formula admits the representation (under measure 



16 
15 
512 



I 1, — ) V+^<^.<^.+^} 



Remind that 



k+l 



Therefore we can write 



< (e^Z;,/^^ {vj)^ (E^ |1 - V;-,Tn'^' < (E^ |1 - Vj,Tf) 



.1/16 



a/4 



1/2 



because E^Z^^ (vj) < 1. Further 



E^ |1 — Vj^tI < Ci E^ 



di 



(26) 
For the last (stochastic) integral we have the estimates



< 



CE^ sup VfAfg.^X.f 

0<t<T \Jo ^ 



< C ( sup 



0<t<T 



1/2 



E^ (^^ g, {Xtf 1 



dt 



8\ 1/2 



Remind that V^^^ is martingale and E^V^® = 1. Using once more the local 
time estimator of the density we write 



9j (Xtf 1I|^^.+-L<x,<^,+;L}di = T 



75 -1-^ 



9j {xf fT (x) dx. 



Hence 



E^ (^^ g, (Xtf 1 



< (uj — Vj)' T I gj (x)^^ E^/^ (x)" dx < C [uj — Vj) 



i9 -t-^ 



The expectation E^/^ (x)^ due to the estimate fl2^ is a bounded function. 
For the first integral in the similar calculations yield the estimate 

(^^ gj (Xtf lI|^^+^<;,^<^^+^| dt) < C {uj - Vjf . 
Therefore, for \uj\ < R,\vj\ < R 



E, 



<C{l + R^) \uj-Vj\\ 



.1/16 



(27) 
To verify condition C we follow the proof of Lemmas 3.29 and 2.11 in [13]. By condition (15) we have



with some positive constants. Here $\delta=(\delta_1,\ldots,\delta_k)$ and we suppose for simplicity that all $\delta_j>0$. Hence the inequality (23) follows from the above-mentioned lemmas.

The properties of the BE follow from Theorem 1.10.2 in [7] because the conditions A, (25) and (23) are sufficient for this theorem.

For the MLE we do not apply directly the Theorem 1.10.1 in [7] be- 
cause it requires in condition B that d > k. We follow the modification of 
this theorem discussed in the proof of the Proposition 2.40 in [13]. Let us 

consider the vector of likelihood ratios Y (li)^ = (^Zl^,^ (ui) , . . . , Zl''^^ (wi) j . 

For the components Z^j! (uj) ,j = 1, . . . ,k we have the joint convergence 
of its dimensional distributions to the distribution of the limit random field 
Y (u) = (^Z^'^ (ui) , . . . , Zl^^ {ui)^ with independent components and the 

conditions B and C. Therefore we have the tightness of the corresponding 
vector of measures and for each component we have the large deviations 
estimates: for any L > and > there exists C^- > such that 



These estimates and the factorization of the likelihood ratio (17) allow us to finish the proof of the properties of the MLE mentioned in Theorem 1. Note that the MLE $\hat\vartheta_{j,T}$ can be written as



^i,T = argmaxg^ge, {^v^^) 



too. 
To prove Proposition 2 we consider the normalized likelihood ratio (we take $u>0$)

In Zt [v,w,u) = In — ^ 



+ (^2 - ^1 + 



Jo ay L Jo 



rp 

W — v\ \ 



T r 



I{Xf<tf3} J^^{Xt>^z} 



T 

-\ 2 

XlAt 



= t;Ai,T + WA2,T + \ 7=- ^3,T [u) - - Jt, 

V cr crVT / ^ 

where the last equality introduces the notation for these integrals. For the last integral we can write

,2 /"T „,,2 /-T 



(J Jo <~> J- Jo 



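For the first two integrals by the law of large numbers we have

$$\frac1T\int_0^TX_t^2\,\mathbb{1}_{\{X_t<\vartheta_3\}}\,dt \longrightarrow E_\vartheta\,\xi^2\,\mathbb{1}_{\{\xi<\vartheta_3\}}, \qquad (29)$$

$$\frac1T\int_0^TX_t^2\,\mathbb{1}_{\{X_t\ge\vartheta_3\}}\,dt \longrightarrow E_\vartheta\,\xi^2\,\mathbb{1}_{\{\xi\ge\vartheta_3\}}, \qquad (30)$$

and for the last one, using the local time estimator of the density, we obtain

$$\int_0^TX_t^2\,\mathbb{1}_{\{\vartheta_3\le X_t<\vartheta_3+\frac{u}{T}\}}\,dt = T\int_{\vartheta_3}^{\vartheta_3+\frac{u}{T}}x^2f^\circ_T(x)\,dx = T\int_{\vartheta_3}^{\vartheta_3+\frac{u}{T}}x^2f(\vartheta,x)\,dx + T\int_{\vartheta_3}^{\vartheta_3+\frac{u}{T}}x^2\,\big(f^\circ_T(x)-f(\vartheta,x)\big)\,dx = u\,\vartheta_3^2\,f(\vartheta,\vartheta_3) + o(1),$$

where in $o(1)$ we used once more the estimate (24). Therefore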

^2 ^2 _ ^/)2^2 
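For the stochastic integrals $\Delta_{1,T}$ and $\Delta_{2,T}$, from (29), (30) and by the central limit theorem we have the convergence

$$\Delta_{1,T}\Longrightarrow\zeta_1\sim\mathcal{N}(0,\mathrm{I}_1), \qquad \mathrm{I}_1 = \frac{1}{\sigma^2}\,E_\vartheta\,\xi^2\,\mathbb{1}_{\{\xi<\vartheta_3\}}, \qquad (31)$$

$$\Delta_{2,T}\Longrightarrow\zeta_2\sim\mathcal{N}(0,\mathrm{I}_2), \qquad \mathrm{I}_2 = \frac{1}{\sigma^2}\,E_\vartheta\,\xi^2\,\mathbb{1}_{\{\xi\ge\vartheta_3\}}, \qquad (32)$$

where the random variables $\zeta_1$ and $\zeta_2$ are independent.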

Let us consider $\Delta_T=\lambda_1\Delta_{3,T}(u_1)+\lambda_2\Delta_{3,T}(u_2)$. We have



dm. 



Note that 



T 







1 2 



T 







[Ui \l + U2\l + 2A1A2 (Ml A M2)] ^3 / (^, ^^3) = C/'. 

Hence Ay is asymptotically normal A^ ^ A with the limit variance (P. 
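Recall that the random variable

$$
\Delta = \lambda_1\,\vartheta_3\sqrt{f(\vartheta,\vartheta_3)}\;W(u_1) + \lambda_2\,\vartheta_3\sqrt{f(\vartheta,\vartheta_3)}\;W(u_2),
$$

where $W(\cdot)$ is a Wiener process, has the same variance. Therefore we have the
convergence of the finite dimensional distributions of $\Delta_{3,T}(u)$ to the finite
dimensional distributions of the process $\vartheta_3\sqrt{f(\vartheta,\vartheta_3)}\,W(u)$:

$$
\left(\Delta_{3,T}(u_1),\ldots,\Delta_{3,T}(u_k)\right) \Longrightarrow \left(\vartheta_3\sqrt{f(\vartheta,\vartheta_3)}\,W(u_1),\ldots,\vartheta_3\sqrt{f(\vartheta,\vartheta_3)}\,W(u_k)\right). \tag{33}
$$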



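This convergence together with (31) and (32) allows us to write the likelihood
ratio random field as

$$
Z_T(v,w,u) = \exp\left\{v\,\Delta_{1,T} - \frac{v^2}{2}I_1 + w\,\Delta_{2,T} - \frac{w^2}{2}I_2 + \frac{\vartheta_2-\vartheta_1}{\sigma}\,\Delta_{3,T}(u) - \frac{u\,(\vartheta_2-\vartheta_1)^2}{2\sigma^2}\,\vartheta_3^2 f(\vartheta,\vartheta_3) + o(1)\right\},
$$

where $\Delta_{1,T}$ and $\Delta_{2,T}$ are asymptotically normal, and

$$
\Delta_{3,T}(u) \Longrightarrow \vartheta_3\sqrt{f(\vartheta,\vartheta_3)}\;W(u).
$$

Therefore we have the convergence of the finite dimensional distributions of
$Z_T(v,w,u)$ to those of the random function

$$
Z(v,w,u) = \exp\left\{v\,\zeta_1 - \frac{v^2}{2}I_1 + w\,\zeta_2 - \frac{w^2}{2}I_2 + \frac{\vartheta_2-\vartheta_1}{\sigma}\,\vartheta_3\sqrt{f(\vartheta,\vartheta_3)}\;W(u) - \frac{u\,(\vartheta_2-\vartheta_1)^2}{2\sigma^2}\,\vartheta_3^2 f(\vartheta,\vartheta_3)\right\},
$$

where $\zeta_1$, $\zeta_2$ and $W(\cdot)$ are independent.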

To check condition B in the case of Bayesian estimation we follow
fl2^ and write ($u_2 > u_1 > 0$)

2 



< 



Z^^^ {Vi, Wi,Ui) - Z^^"^ (f2, W2, U2) 
1 



4ct2 



- ^72) ]I{Xt<^?3} , (wi - W2) ]I{Xt>^3} 



T 



t;2 - t^i - ^^2 + wi , ^ 



$\le C_1 (v_1 - v_2)^2 + C_2 (w_1 - w_2)^2 + C_3 |u_2 - u_1|.$

In the case of the MLE this estimate is not sufficient because the condition
$d > 3$ is not fulfilled. We slightly modify the proof of (27). Let us denote
$\mathbf{u} = (v, w, u)$ and put

1 

' Zt (v2,W2,U2] 



Vt 



Zt {vi,wi,ui) 



Then 
E,< 



{Vi,Wi,Ui) - {V2,W2,U2) = E^Z^ {V2,W2,U2) 1 1 - Vr 



< (E^Z^V2,W2,U2)) (E^II-^tI < (E^|1-Ft| 
The process $V_t$, $0 \le t \le T$, by the Itô formula admits the representation

$$
V_T = 1 - a\int_0^T V_t\left(\Delta S(X_t)\right)^2 dt + b\int_0^T V_t\,\Delta S(X_t)\,dW_t
$$

with corresponding constants $a > 0$ and $b > 0$, where $\Delta S(X_t) = \Delta S_t$ is the
difference of two trend coefficients. Hence

rT \ 16 . „T \ 16 

E^|1-Vt|''<AE^( / VtiAStfdt] +BE^( VtiASt)dt 











< 
Further



[ {^Stfdt] < sup ( [ (AStfdt 

\Jo / 0<t<T \Jo 



< ( sup 

0<t<T 



24\ \ 



24\ 3 



E^( / {AStfdt 

because E^ supo<t<T' Vt^^ < 1. Now with the help of ([28]) we can write 
E^ (^j^ {AS {Xt)f d?j = E^ (^T (A^ {x)f /° (x) dx 

After substitution of these estimates we obtain 

E^ Z^^ {Vi,Wi,Ui) - iv2,W2,U2] 

< A\V2 — Vif + B \W2 — Wif + C \U2 — . 

Therefore for the values + + < Rwe have 

E^ {Vi, Wi, Ui) - Z^"" {V2, W2, U2) 

<C{1 + R") {\V2 -Vi\''+ \W2 - Wll' + \U2 - Ult) (34) 

Hence the condition B is fulfilled with $m = 4$ and $d = 4 > 3$ for the random
field Yt (f , w, u) = Z^ (f , w, u). 

To verify the condition C we follow the proof of Lemma 2.11 in [13]. We
write ($u > 0$)

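$$
\begin{aligned}
\mathbf{E}_\vartheta J_T &= \frac{v^2}{\sigma^2 T}\int_0^T \mathbf{E}_\vartheta X_t^2\,\mathbb{1}_{\{X_t<\vartheta_3\}}\,dt + \frac{w^2}{\sigma^2 T}\int_0^T \mathbf{E}_\vartheta X_t^2\,\mathbb{1}_{\{X_t>\vartheta_3\}}\,dt \\
&\quad + \left[\left(\frac{\vartheta_2-\vartheta_1}{\sigma} + \frac{w-v}{\sigma\sqrt{T}}\right)^2 - \frac{2w}{\sigma\sqrt{T}}\left(\frac{\vartheta_2-\vartheta_1}{\sigma} + \frac{w-v}{\sigma\sqrt{T}}\right)\right]\int_0^T \mathbf{E}_\vartheta X_t^2\,\mathbb{1}_{\{\vartheta_3<X_t<\vartheta_3+\frac{u}{T}\}}\,dt.
\end{aligned}
$$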

Note that 



- «2 - /3i 1 

< K = < — 

a a 



n W f ^ V 



a 



Hence 

0" 0" Jfc 



\w\ /"'«+^ 



Let us put 6 = K^/AK, then for ^ + + ^ < 5 we have 



^2 

E^Jt>i;^Ii + w2I2 + |m|— inf f{'&,x), (35) 

2 a3<a;<,93 

and for the vector $h = (h_1, h_2, h_3)$ with $h_1 = v/\sqrt{T}$, $h_2 = w/\sqrt{T}$, $h_3 = u/T$ and
$\|h\| > \delta$ we can write



E^ Jt , 

n 



/173 /"OO 
x^/ (i^jx) dx + / x^/(i9,x)dx 

+ i'&i + hif I x^f {'&, x) dx 

x'^f{'d,x)dx + hl / xV(^,a;)dx 

-oo J I3z 

+ {a2-l3if xV a;) dx > Ki > 0. (36) 

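Here we used the representation

$$
\begin{aligned}
&-\vartheta_1\,\mathbb{1}_{\{x<\vartheta_3\}} - \vartheta_2\,\mathbb{1}_{\{x>\vartheta_3\}} + (\vartheta_1+h_1)\,\mathbb{1}_{\{x<\vartheta_3+h_3\}} + (\vartheta_2+h_2)\,\mathbb{1}_{\{x>\vartheta_3+h_3\}} \\
&\qquad = h_1\,\mathbb{1}_{\{x<\vartheta_3\}} + h_2\,\mathbb{1}_{\{x>\vartheta_3+h_3\}} + (\vartheta_1-\vartheta_2+h_1)\,\mathbb{1}_{\{\vartheta_3<x<\vartheta_3+h_3\}}.
\end{aligned}
$$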
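The indicator representation used in this step can be checked numerically. The following is only a sanity-check sketch, assuming $h_3 > 0$ and the sign convention in which the shifted coefficients $(\vartheta_1+h_1)$ and $(\vartheta_2+h_2)$ both enter with a plus sign; the function names `lhs` and `rhs` and the parameter values are hypothetical.

```python
import numpy as np

def lhs(x, th1, th2, th3, h1, h2, h3):
    # -th1*1{x<th3} - th2*1{x>th3} + (th1+h1)*1{x<th3+h3} + (th2+h2)*1{x>th3+h3}
    return (-th1 * (x < th3) - th2 * (x > th3)
            + (th1 + h1) * (x < th3 + h3) + (th2 + h2) * (x > th3 + h3))

def rhs(x, th1, th2, th3, h1, h2, h3):
    # h1*1{x<th3} + h2*1{x>th3+h3} + (th1-th2+h1)*1{th3<x<th3+h3}
    middle = (x > th3) & (x < th3 + h3)
    return h1 * (x < th3) + h2 * (x > th3 + h3) + (th1 - th2 + h1) * middle

# Hypothetical parameter values, with h3 > 0 as assumed above.
params = dict(th1=0.7, th2=1.9, th3=0.4, h1=0.2, h2=-0.3, h3=0.5)
x = np.random.default_rng(1).uniform(-3.0, 3.0, size=1000)
identity_holds = np.allclose(lhs(x, **params), rhs(x, **params))
```

The check works region by region: on $x < \vartheta_3$ both sides equal $h_1$, on the middle interval both equal $\vartheta_1 - \vartheta_2 + h_1$, and on $x > \vartheta_3 + h_3$ both equal $h_2$.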

Now having (35) and (36) we can follow the proof of Lemma 2.11 in [13]
and obtain the estimate (23).

The properties of the modified (simplified) estimators defined by the equalities
([H]) will be proved if we verify the law of large numbers. For any
$\varepsilon > 0$, using the consistency, we can write



T 



^3,T - ^3 



> T- 



<P^ { sup 



>e}+o{l) 



Further 



sup 

\e-d3\<T-b 



T 



^ l^^t^{x.<e}dt-li 



< 



sup 

sup 

\e--d3\<T-'' 



^■3 3.2 



cr 



1 fT ^ 



^3 3.2 



— (x) dx- —f (i?, x) da; 



[/^(a;)-/(79,x)]da; 



+ 



^3+T''' ^2 



^3 



f:^ (x) dx + o (1) . 



To finish the proof we just mention that

$$
\int_{-\infty}^{\vartheta_3} \frac{x^2}{\sigma^2}\left[f_T^\circ(x) - f(\vartheta,x)\right]dx \longrightarrow 0
$$

by the law of large numbers.

6 Goodness of Fit Testing 

Recall two well known goodness of fit (GoF) tests of classical statistics [15].
If we observe $n$ i.i.d. random variables $(X_1, \ldots, X_n) = X^n$ with distribution
function $F(x)$ and the basic hypothesis is simple

$$
\mathcal{H}_0 : \quad F(x) = F_*(x), \qquad x \in \mathbb{R},
$$

then the Cramér-von Mises (C-vM) and Kolmogorov-Smirnov (K-S) statistics are

$$
W_n^2 = n\int_{-\infty}^{\infty}\left[\hat F_n(x) - F_*(x)\right]^2 dF_*(x), \qquad D_n = \sup_x \sqrt{n}\,\left|\hat F_n(x) - F_*(x)\right|,
$$

respectively. Here $\hat F_n(x)$ is the empirical distribution function.

For continuous $F_*(x)$ under hypothesis $\mathcal{H}_0$ we have the convergence

$$
W_n^2 \Longrightarrow \int_0^1 W_0(s)^2\,ds, \qquad D_n \Longrightarrow \sup_{0\le s\le 1}\left|W_0(s)\right|,
$$

where $W_0(\cdot)$ is a Brownian bridge. The limit distributions do not depend
on the model $F_*(\cdot)$ (the tests are asymptotically distribution free) and this
essentially simplifies the choice of the corresponding thresholds for the
Cramér-von Mises and Kolmogorov-Smirnov tests. Note that both tests are
consistent against any fixed alternative.

Our goal is to discuss the possibility of constructing asymptotically
distribution free tests for the threshold diffusion processes considered in
this work.

Suppose that the basic hypothesis is simple:

$\mathcal{H}_0$ : the observed process $X^T$ is TOU($\vartheta_0$),

i.e., the observations $X^T = (X_t, 0 \le t \le T)$ come from the equation

$$
dX_t = -\rho_1 X_t\,\mathbb{1}_{\{X_t<\vartheta_0\}}\,dt - \rho_2 X_t\,\mathbb{1}_{\{X_t>\vartheta_0\}}\,dt + \sigma\,dW_t, \qquad 0 \le t \le T,
$$

with known $\vartheta_0$, and we have to test this hypothesis. We propose below some
tests of C-vM and K-S types of asymptotic size $\alpha$.

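Let us denote $g(x, \vartheta_0) = -\rho_1 x\,\mathbb{1}_{\{x<\vartheta_0\}} - \rho_2 x\,\mathbb{1}_{\{x>\vartheta_0\}}$ and, following [3],
introduce the statistics

$$
W_T^2 = \frac{1}{\sigma^2 T^2}\int_0^T\left[X_t - X_0 - \int_0^t g(X_s, \vartheta_0)\,ds\right]^2 dt
$$

and

$$
D_T = \frac{1}{\sigma\sqrt{T}}\sup_{0\le t\le T}\left|X_t - X_0 - \int_0^t g(X_s, \vartheta_0)\,ds\right|.
$$

It is easy to see that under $\mathcal{H}_0$ (= in distribution)

$$
W_T^2 = \int_0^1 W(s)^2\,ds, \qquad D_T = \sup_{0\le s\le 1}\left|W(s)\right|,
$$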

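where $W(\cdot)$ is a Wiener process. Hence the tests

$$
\psi_T\left(X^T\right) = \mathbb{1}_{\{W_T^2 > c_\alpha\}}, \qquad \phi_T\left(X^T\right) = \mathbb{1}_{\{D_T > d_\alpha\}}
$$

are distribution free. Here the thresholds $c_\alpha$, $d_\alpha$ are solutions of the equations

$$
\mathbf{P}\left\{\int_0^1 W(s)^2\,ds > c_\alpha\right\} = \alpha, \qquad \mathbf{P}\left\{\sup_{0\le s\le 1}\left|W(s)\right| > d_\alpha\right\} = \alpha.
$$

Both tests are consistent against any fixed alternative.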
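As a rough numerical illustration, the two statistics for the simple hypothesis and their thresholds can be approximated on a time grid. Under the hypothesis the residual $X_t - X_0 - \int_0^t g(X_s,\vartheta_0)\,ds$ equals $\sigma W_t$, so the self-similarity of the Wiener process gives the stated limits. The sketch below is not taken from the paper: it assumes an Euler-Maruyama discretization, left-point Riemann sums and Monte Carlo quantiles of a discretized Wiener process, and the names `drift`, `simulate_tou`, `gof_statistics`, `wiener_thresholds` and all numerical settings are hypothetical.

```python
import numpy as np

def drift(x, rho1, rho2, theta):
    """Threshold trend g(x, theta) = -rho1*x*1{x<theta} - rho2*x*1{x>theta}.
    The point x == theta is assigned to the second regime (a null set)."""
    return np.where(x < theta, -rho1 * x, -rho2 * x)

def simulate_tou(rho1, rho2, theta, sigma, T, n, rng, x0=0.0):
    """Euler-Maruyama path of the threshold OU equation on a grid of n steps."""
    dt = T / n
    x = np.empty(n + 1)
    x[0] = x0
    noise = rng.normal(0.0, np.sqrt(dt), size=n)
    for k in range(n):
        x[k + 1] = x[k] + drift(x[k], rho1, rho2, theta) * dt + sigma * noise[k]
    return x

def gof_statistics(x, T, rho1, rho2, theta, sigma):
    """Discretized C-vM type (W_T^2) and K-S type (D_T) statistics."""
    n = len(x) - 1
    dt = T / n
    # Left-point approximation of the integral of g(X_s, theta) over [0, t_k].
    integral = np.concatenate(([0.0], np.cumsum(drift(x[:-1], rho1, rho2, theta)) * dt))
    resid = x - x[0] - integral
    w2 = np.sum(resid[:-1] ** 2) * dt / (sigma ** 2 * T ** 2)
    d = np.max(np.abs(resid)) / (sigma * np.sqrt(T))
    return w2, d

def wiener_thresholds(alpha, n_paths, n_steps, rng):
    """Monte Carlo (1 - alpha)-quantiles of int_0^1 W^2 ds and sup |W| on [0, 1]."""
    ds = 1.0 / n_steps
    w = np.cumsum(rng.normal(0.0, np.sqrt(ds), size=(n_paths, n_steps)), axis=1)
    c_alpha = np.quantile(np.sum(w ** 2, axis=1) * ds, 1.0 - alpha)
    d_alpha = np.quantile(np.max(np.abs(w), axis=1), 1.0 - alpha)
    return c_alpha, d_alpha
```

A possible use is to simulate a path with `simulate_tou`, compute `gof_statistics` with the hypothesized parameters, and reject when a statistic exceeds the corresponding value returned by `wiener_thresholds`. The grid maximum slightly underestimates the continuous-time supremum, so the approximate K-S threshold is somewhat too small on coarse grids.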
Suppose now that the basic hypothesis is composite: 

J^Q : the observed process 

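Let us introduce the statistics

$$
\hat W_T^2 = \frac{1}{\sigma^2 T^2}\int_0^T\left[X_t - X_0 - \int_0^t g(X_s, \vartheta_T^*)\,ds\right]^2 dt
$$

and

$$
\hat D_T = \frac{1}{\sigma\sqrt{T}}\sup_{0\le t\le T}\left|X_t - X_0 - \int_0^t g(X_s, \vartheta_T^*)\,ds\right|,
$$

where $\vartheta_T^*$ is the maximum likelihood or Bayesian estimator. Recall that
$\vartheta_T^* = \vartheta_0 + O(T^{-1})$. Using this singular rate of convergence of the estimator
it can be shown that under $\mathcal{H}_0$ we have the same limit distributions of the statistics:

$$
\hat W_T^2 \Longrightarrow \int_0^1 W(s)^2\,ds, \qquad \hat D_T \Longrightarrow \sup_{0\le s\le 1}\left|W(s)\right|.
$$

Hence the tests $\hat\psi_T(X^T) = \mathbb{1}_{\{\hat W_T^2 > c_\alpha\}}$ and $\hat\phi_T(X^T) = \mathbb{1}_{\{\hat D_T > d_\alpha\}}$ are
asymptotically distribution free. These tests are also consistent against any fixed
alternative.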

Similar limits hold in the case of the general model (15). For example,
suppose that $\vartheta_0$ is known and denote the trend coefficient in (15) as
$S(\vartheta_0, X_t)$. Then once more (under hypothesis)

$$
\frac{1}{T^2}\int_0^T\left(\int_0^t \frac{dX_s - S(\vartheta_0, X_s)\,ds}{\sigma(X_s)}\right)^2 dt = \int_0^1 W(s)^2\,ds.
$$

If the basic hypothesis is composite then the same statistic with $\vartheta_0$ replaced
by one of the estimators (MLE or BE) has this last integral as limit (in
distribution).

It is interesting to study the direct analogs of the classical C-vM and 
K-S tests. Let us introduce the empirical distribution function and empirical 
density (local time estimator) 



Ft{x) 



1 
T 



T 



At jx) 
Then the corresponding C-vM statistics

$$
W_T^2 = T\int_{-\infty}^{\infty}\left[\hat F_T(x) - F(\vartheta_0, x)\right]^2 dF(\vartheta_0, x), \qquad
V_T^2 = T\int_{-\infty}^{\infty}\left[f_T^\circ(x) - f(\vartheta_0, x)\right]^2 dF(\vartheta_0, x)
$$

have limits in distribution but these limits are not distribution free [TB]. One
way to obtain an asymptotically distribution free statistic was proposed by Negri
and Nishiyama [17]. Another possibility (discussed in [H]) is to use
weight functions. Let us illustrate the second approach on the statistic

$$
V_T^2(\vartheta_0) = T\int_{-\infty}^{\infty} H(\vartheta_0, x)\left(\hat F_T(x) - F(\vartheta_0, x)\right)^2 dF(\vartheta_0, x)
$$

with weight function 

H {'do, x) = M (-.^o, x)) . 

^ f{^o,x)[F{^o,x)-lf 

where M(-) is some function providing the finitness of this integral and 
J-oo a[y) to {'&o,y) 

F{'do,y)-lV dy 



F(^o,x)-l7 a{yff{^o,yy 
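It is shown that if $M(s) = e^{-s}$ then

$$
V_T^2(\vartheta_0) \Longrightarrow \int_0^\infty W(s)^2\,e^{-s}\,ds,
$$

where $W(\cdot)$ is a Wiener process, i.e., we have the asymptotically distribution
free test $\psi_T = \mathbb{1}_{\{V_T^2(\vartheta_0) > r_\alpha\}}$, where the threshold $r_\alpha$ is, of course, the solution of
the following equation

$$
\mathbf{P}\left\{\int_0^\infty W(s)^2\,e^{-s}\,ds > r_\alpha\right\} = \alpha.
$$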
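The threshold $r_\alpha$ has no simple closed form, but it can be approximated by simulation. The sketch below is an illustration under stated assumptions rather than the author's procedure: it assumes the limit is the integral of $W(s)^2 e^{-s}$ over $[0,\infty)$, truncates the integral at a finite horizon where the weight is negligible, and uses a right-point Riemann sum. The function name `weighted_threshold` and its default settings are hypothetical.

```python
import numpy as np

def weighted_threshold(alpha, n_paths=2000, horizon=20.0, n_steps=2000, seed=0):
    """Monte Carlo (1 - alpha)-quantile of int_0^horizon W(s)^2 exp(-s) ds.

    The horizon truncates the infinite integral; exp(-20) is about 2e-9,
    so the neglected tail is tiny compared with the Monte Carlo error.
    """
    rng = np.random.default_rng(seed)
    ds = horizon / n_steps
    s = ds * np.arange(1, n_steps + 1)            # right end points of the grid
    w = np.cumsum(rng.normal(0.0, np.sqrt(ds), size=(n_paths, n_steps)), axis=1)
    values = np.sum(w ** 2 * np.exp(-s), axis=1) * ds
    return np.quantile(values, 1.0 - alpha), values
```

Since $\mathbf{E}\,W(s)^2 = s$, the limit variable has expectation $\int_0^\infty s\,e^{-s}\,ds = 1$, which gives a quick consistency check on the simulated values before the quantile is used as a threshold.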



A similar result can be proved for a large class of functions $M(\cdot)$ satisfying
the obvious conditions.

In the case of a composite hypothesis we can replace $\vartheta_0$ by one of the
estimators, say, use $V_T^2(\vartheta_T^*)$, and obtain the same (distribution free) limit
of this statistic.